Synthetic Generation of Multidimensional Data to Improve Classification Model Validity
Abstract
This paper aims to compare Generative Adversarial Network (GAN) models and feature selection methods for generating synthetic data in order to improve the validity of a classification model. The synthetic data generation technique involves generating new samples from existing data to increase its diversity and help the model generalize better. The multidimensional aspect of the data refers to the fact that it can have multiple features or variables that describe it. GAN models have proven to be effective in preserving the statistical properties of the original data. However, data augmentation is crucial to build robust and accurate predictive models. By comparing different GAN models with feature selection methods on multi-dimensional datasets, this paper aims to determine the best combination to support data ...
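As a rough illustration of the augmentation idea in the abstract, the sketch below generates new labeled samples from existing multidimensional data and checks that a simple statistical property (the per-class feature means) is approximately preserved. It is not the paper's method: a per-class Gaussian sampler stands in for a GAN to keep the example short, and the toy dataset, the function names `fit_class_gaussians` and `sample_synthetic`, and the sample sizes are all assumptions made for illustration.

```python
import numpy as np

def fit_class_gaussians(X, y):
    """Estimate a mean vector and covariance matrix for each class label."""
    params = {}
    for label in np.unique(y):
        Xc = X[y == label]
        params[label] = (Xc.mean(axis=0), np.cov(Xc, rowvar=False))
    return params

def sample_synthetic(params, n_per_class, rng):
    """Draw n_per_class synthetic rows per class from the fitted Gaussians."""
    X_parts, y_parts = [], []
    for label, (mean, cov) in params.items():
        X_parts.append(rng.multivariate_normal(mean, cov, size=n_per_class))
        y_parts.append(np.full(n_per_class, label))
    return np.vstack(X_parts), np.concatenate(y_parts)

rng = np.random.default_rng(0)

# Hypothetical toy data: two classes, three features each.
X_real = np.vstack([
    rng.normal(0.0, 1.0, size=(200, 3)),
    rng.normal(3.0, 1.0, size=(200, 3)),
])
y_real = np.repeat([0, 1], 200)

# Fit the stand-in generator and draw synthetic samples.
params = fit_class_gaussians(X_real, y_real)
X_syn, y_syn = sample_synthetic(params, n_per_class=500, rng=rng)

# Augmented training set: real rows followed by synthetic rows.
X_aug = np.vstack([X_real, X_syn])
y_aug = np.concatenate([y_real, y_syn])

# Largest absolute gap between real and synthetic per-class feature means.
mean_gap = max(
    np.abs(X_real[y_real == c].mean(axis=0) - X_syn[y_syn == c].mean(axis=0)).max()
    for c in (0, 1)
)
print("augmented shape:", X_aug.shape)
print("max per-class mean gap:", float(mean_gap))
```

In a comparison like the one the abstract describes, a trained GAN would presumably replace the Gaussian sampler, and a feature selection step would be applied before a classifier is trained on `X_aug` and evaluated on held-out real data.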
Journal
Journal title: Journal of Data and Information Quality
Year: 2023
ISSN: 1936-1963, 1936-1955
DOI: https://doi.org/10.1145/3603715